Function parser for C++ v2.8 by Warp. ===================================== Optimization code contributed by Bisqwit (http://iki.fi/bisqwit/) The usage license of this library is located at the end of this text file. What's new in v2.8 ------------------ - Put the compile-time options to a separate file fpconfig.hh. - Added comparison operators "!=", "<=" and ">=". - Added an epsilon value to the comparison operators (its value can be changed in fpconfig.hh or it can be completely disabled there). - Added unary not operator "!". - Now user-defined functions can take 0 parameters. - Added a maximum recursion level to the "eval()" function (definable in (fpconfig.hh). Now "eval()" should never cause an infinite recursion. (Note however, that it may still be relevant to disable it completely because it is possible to write functions which take enormous amounts of time to evaluate even when the maximum recursion level is not reached.) - Separated the optimizer code to its own file (makes developement easier). ============================================================================= - Preface ============================================================================= Often people need to ask some mathematical expression from the user and then evaluate values for that expression. The simplest example is a program which draws the graphic of a user-defined function on screen. This library adds C-style function string parsing to the program. This means that you can evaluate the string "sqrt(1-x^2+y^2)" with given values of 'x' and 'y'. The library is intended to be very fast. It byte-compiles the function string at parse time and interpretes this byte-code at evaluation time. The evaluation is straightforward and no recursions are done (uses stack arithmetic). Empirical tests show that it indeed is very fast (specially compared to libraries which evaluate functions by just interpreting the raw function string). The library is made in ISO C++ and requires a standard-conforming C++ compiler. ============================================================================= - Usage ============================================================================= To use the FunctionParser class, you have to include "fparser.hh" in your source code files which use the FunctionParser class. When compiling, you have to compile fparser.cc and fpoptimizer.cc and link them to the main program. In some developement environments it's enough to add those two files to your current project. If you are not going to use the optimizer (ie. you have commented out SUPPORT_OPTIMIZER in fpconfig.hh), you can leave the latter file out. * Configuring the compilation: --------------------------- There is a set of precompiler options in the fpconfig.hh file which can be used for setting certain features on or off: NO_ASINH : (Default on) By default the library does not support the asinh(), acosh() and atanh() functions because they are not part of the ISO C++ standard. If your compiler supports them and you want the parser to support them as well, comment out this line. DISABLE_EVAL : (Default off) Even though the maximum recursion level of the eval() function is limited, it is still possible to write functions which never reach this maximum recursion level but take enormous amounts of time to evaluate (this can be undesirable eg. in web server-side applications). Uncommenting this line will disable the eval() function completely, thus removing the danger of exploitation. Note that you can also disable eval() by specifying the DISABLE_EVAL precompiler constant in your compiler (eg. with -DDISABLE_EVAL in gcc). EVAL_MAX_REC_LEVEL : (Default 1000) Sets the maximum recursion level allowed for eval(). SUPPORT_OPTIMIZER : (Default on) If you are not going to use the Optimize() method, you can comment this line out to speed-up the compilation of fparser.cc a bit, as well as making the binary a bit smaller. (Optimize() can still be called, but it will not do anything.) You can also disable the optimizer by specifying the NO_SUPPORT_OPTIMIZER precompiler constant in your compiler (eg. with -DNO_SUPPORT_OPTIMIZER in gcc). FP_EPSILON : (Default 1e-14) Epsilon value used in comparison operators. If this line is commented out, then no epsilon will be used. * Copying and assignment: ---------------------- The class implements a safe copy constructor and assignment operator. It uses the copy-on-write technique for efficiency. This means that when copying or assigning a FunctionParser instance, the internal data (which in some cases can be quite lengthy) is not immediately copied but only when the contents of the copy (or the original) are changed. This means that copying/assigning is a very fast operation, and if the copies are never modified then actual data copying never happens either. The Eval() and EvalError() methods of the copy can be called without the internal data being copied. Calling Parse(), Optimize() or the user-defined constant/function adding methods will cause a deep-copy. (C++ basics: The copy constructor is called when a new FunctionParser instance is initialized with another, ie. like: FunctionParser fp2 = fp1; // or: FunctionParser fp2(fp1); or when a function takes a FunctionParser instance as parameter, eg: void foo(FunctionParser p) // takes an instance of FunctionParser { ... } The assignment operator is called when a FunctionParser instance is assigned to another, like "fp2 = fp1;".) * Short descriptions of FunctionParser methods: -------------------------------------------- int Parse(const std::string& Function, const std::string& Vars, bool useDegrees = false); Parses the given function and compiles it to internal format. Return value is -1 if successful, else the index value to the location of the error. const char* ErrorMsg(void) const; Returns an error message corresponding to the error in Parse(), or 0 if no such error occurred. ParseErrorType GetParseErrorType() const; Returns the type of parsing error which occurred. Possible return types are described in the long description. double Eval(const double* Vars); Evaluates the function given to Parse(). int EvalError(void) const; Returns 0 if no error happened in the previous call to Eval(), else an error code >0. void Optimize(); Tries to optimize the bytecode for faster evaluation. bool AddConstant(const std::string& name, double value); Add a constant to the parser. Returns false if the name of the constant is invalid, else true. bool AddFunction(const std::string& name, double (*functionPtr)(const double*), unsigned paramsAmount); Add a user-defined function to the parser (as a function pointer). Returns false if the name of the function is invalid, else true. bool AddFunction(const std::string& name, FunctionParser&); Add a user-defined function to the parser (as a FunctionParser instance). Returns false if the name of the function is invalid, else true. * Long descriptions of FunctionParser methods: ------------------------------------------- --------------------------------------------------------------------------- int Parse(const std::string& Function, const std::string& Vars, bool useDegrees = false); --------------------------------------------------------------------------- Parses the given function (and compiles it to internal format). Destroys previous function. Following calls to Eval() will evaluate the given function. The strings given as parameters are not needed anymore after parsing. Parameters: Function : String containing the function to parse. Vars : String containing the variable names, separated by commas. Eg. "x,y", "VarX,VarY,VarZ,n" or "x1,x2,x3,x4,__VAR__". useDegrees: (Optional.) Whether to use degrees or radians in trigonometric functions. (Default: radians) Variables can have any size and they are case sensitive (ie. "var", "VAR" and "Var" are *different* variable names). Letters, digits and underscores can be used in variable names, but the name of a variable can't begin with a digit. Each variable name can appear only once in the 'Vars' string. Function names are not legal variable names. Using longer variable names causes no overhead whatsoever to the Eval() method, so it's completely safe to use variable names of any size. The third, optional parameter specifies whether angles should be interpreted as radians or degrees in trigonometrical functions. If not specified, the default value is radians. Return values: -On success the function returns -1. -On error the function returns an index to where the error was found (0 is the first character, 1 the second, etc). If the error was not a parsing error returns an index to the end of the string + 1. Example: parser.Parse("3*x+y", "x,y"); --------------------------------------------------------------------------- const char* ErrorMsg(void) const; --------------------------------------------------------------------------- Returns a pointer to an error message string corresponding to the error caused by Parse() (you can use this to print the proper error message to the user). If no such error has occurred, returns 0. --------------------------------------------------------------------------- ParseErrorType GetParseErrorType() const; --------------------------------------------------------------------------- Returns the type of parse error which occurred. This method can be used to get the error type if ErrorMsg() is not enough for printing the error message. In other words, this can be used for printing customized error messages (eg. in another language). If the default error messages suffice, then this method doesn't need to be called. FunctionParser::ParseErrorType is an enumerated type inside the class (ie. its values are accessed like "FunctionParser::SYNTAX_ERROR"). The possible values for FunctionParser::ParseErrorType are listed below, along with their equivalent error message returned by the ErrorMsg() method: FP_NO_ERROR : If no error occurred in the previous call to Parse(). SYNTAX_ERROR : "Syntax error" MISM_PARENTH : "Mismatched parenthesis" MISSING_PARENTH : "Missing ')'" EMPTY_PARENTH : "Empty parentheses" EXPECT_OPERATOR : "Syntax error: Operator expected" OUT_OF_MEMORY : "Not enough memory" UNEXPECTED_ERROR : "An unexpected error occurred. Please make a full bug " "report to the author" INVALID_VARS : "Syntax error in parameter 'Vars' given to " "FunctionParser::Parse()" ILL_PARAMS_AMOUNT : "Illegal number of parameters to function" PREMATURE_EOS : "Syntax error: Premature end of string" EXPECT_PARENTH_FUNC: "Syntax error: Expecting ( after function" --------------------------------------------------------------------------- double Eval(const double* Vars); --------------------------------------------------------------------------- Evaluates the function given to Parse(). The array given as parameter must contain the same amount of values as the amount of variables given to Parse(). Each value corresponds to each variable, in the same order. Return values: -On success returns the evaluated value of the function given to Parse(). -On error (such as division by 0) the return value is unspecified, probably 0. Example: double Vars[] = {1, -2.5}; double result = parser.Eval(Vars); --------------------------------------------------------------------------- int EvalError(void) const; --------------------------------------------------------------------------- Used to test if the call to Eval() succeeded. Return values: If there was no error in the previous call to Eval(), returns 0, else returns a positive value as follows: 1: division by zero 2: sqrt error (sqrt of a negative value) 3: log error (logarithm of a negative value) 4: trigonometric error (asin or acos of illegal value) 5: maximum recursion level in eval() reached --------------------------------------------------------------------------- void Optimize(); --------------------------------------------------------------------------- This method can be called after calling the Parse() method. It will try to simplify the internal bytecode so that it will evaluate faster (it tries to reduce the amount of opcodes in the bytecode). For example, the bytecode for the function "5+x*y-25*4/8" will be reduced to a bytecode equivalent to the function "x*y-7.5" (the original 11 opcodes will be reduced to 5). Besides calculating constant expressions (like in the example), it also performs other types of simplifications with variable and function expressions. This method is quite slow and the decision of whether to use it or not should depend on the type of application. If a function is parsed once and evaluated millions of times, then calling Optimize() may speed-up noticeably. However, if there are tons of functions to parse and each one is evaluated once or just a few times, then calling Optimize() will only slow down the program. Also, if the original function is expected to be optimal, then calling Optimize() would be useless. Note: Currently this method does not make any checks (like Eval() does) and thus things like "1/0" will cause undefined behaviour. (On the other hand, if such expression is given to the parser, Eval() will always give an error code, no matter what the parameters.) If caching this type of errors is important, a work-around is to call Eval() once before calling Optimize() and checking EvalError(). If the destination application is not going to use this method, the compiler constant SUPPORT_OPTIMIZER can be undefined in fpconfig.hh to make the library smaller (Optimize() can still be called, but it will not do anything). (If you are interested in seeing how this method optimizes the opcode, you can call the PrintByteCode() method before and after the call to Optimize() to see the difference.) --------------------------------------------------------------------------- bool AddConstant(const std::string& name, double value); --------------------------------------------------------------------------- This method can be used to add constants to the parser. Syntactically constants are identical to variables (ie. they follow the same naming rules and they can be used in the function string in the same way as variables), but internally constants are directly replaced with their value at parse time. Constants used by a function must be added before calling Parse() for that function. Constants are preserved between Parse() calls in the current FunctionParser instance, so they don't need to be added but once. (If you use the same constant in several instances of FunctionParser, you will need to add it to all the instances separately.) Constants can be added at any time and the value of old constants can be changed, but new additions and changes will only have effect the next time Parse() is called. (That is, changing the value of a constant after calling Parse() and before calling Eval() will have no effect.) The return value will be false if the 'name' of the constant was illegal, else true. If the name was illegal, the method does nothing. Example: parser.AddConstant("pi", 3.1415926535897932); Now for example parser.Parse("x*pi", "x"); will be identical to the call parser.Parse("x*3.1415926535897932", "x"); --------------------------------------------------------------------------- bool AddFunction(const std::string& name, double (*functionPtr)(const double*), unsigned paramsAmount); --------------------------------------------------------------------------- This method can be used to add new functions to the parser. For example, if you would like to add a function "sqr(A)" which squares the value of A, you can do it with this method (so that you don't need to touch the source code of the parser). The method takes three parameters: - The name of the function. The name follows the same naming conventions as variable names. - A C++ function, which will be called when evaluating the function string (if the user-given function is called there). The C++ function must have the form: double functionName(const double* params); - The number of parameters the function takes. 0 is a valid value in which case the function takes no parameters (such function should simply ignore the double* it gets as a parameter). The return value will be false if the given name was invalid (either it did not follow the variable naming conventions, or the name was already reserved), else true. If the return value is false, nothing is added. Example: Suppose we have a C++ function like this: double Square(const double* p) { return p[0]*p[0]; } Now we can add this function to the parser like this: parser.AddFunction("sqr", Square, 1); parser.Parse("2*sqr(x)", "x"); An example of a useful function taking no parameters is a function returning a random value. For example: double Rand(const double*) { return drand48(); } parser.AddFunction("rand", Rand, 0); IMPORTANT NOTE: If you use the Optimize() method, it will assume that the user-given function has no side-effects, that is, it always returns the same value for the same parameters. The optimizer will optimize the function call away in some cases, making this assumption. (The Rand() function given as example above is one such problematic case.) --------------------------------------------------------------------------- bool AddFunction(const std::string& name, FunctionParser&); --------------------------------------------------------------------------- This method is almost identical to the previous AddFunction(), but instead of taking a C++ function, it takes another FunctionParser instance. There are some important restrictions on making a FunctionParser instance call another: - The FunctionParser instance given as parameter must be initialized with a Parse() call before giving it as parameter. That is, if you want to use the parser A in the parser B, you must call A.Parse() before you can call B.AddFunction("name", A). - The amount of variables in the FunctionParser instance given as parameter must not change after it has been given to the AddFunction() of another instance. Changing the number of variables will result in malfunction. - AddFunction() will fail (ie. return false) if a recursive loop is formed. The method specifically checks that no such loop is built. Example: FunctionParser f1, f2; f1.Parse("x*x", "x"); f2.AddFunction("sqr", f1); This version of the AddFunction() method can be useful to eg. chain user-given functions. For example, ask the user for a function F1, and then ask the user another function F2, but now the user can call F1 in this second function if he wants (and so on with a third function F3, where he can call F1 and F2, etc). ============================================================================= - The function string ============================================================================= The function string understood by the class is very similar to the C-syntax. Arithmetic float expressions can be created from float literals, variables or functions using the following operators in this order of precedence: () expressions in parentheses first A^B exponentiation (A raised to the power B) -A unary minus !A unary logical not (result is 1 if int(A) is 0, else 0) A*B A/B A%B multiplication, division and modulo A+B A-B addition and subtraction A=B A!=B AB A>=B comparison between A and B (result is either 0 or 1) A&B result is 1 if int(A) and int(B) differ from 0, else 0 A|B result is 1 if int(A) or int(B) differ from 0, else 0 Since the unary minus has higher precedence than any other operator, for example the following expression is valid: x*-y The comparison operators use an epsilon value, so expressions which may differ in very least-significant digits should work correctly. For example, "0.1+0.1+0.1+0.1+0.1+0.1+0.1+0.1+0.1+0.1 = 1" should always return 1, and the same comparison done with ">" or "<" should always return 0. (The epsilon value can be configured in the fpconfig.hh file.) Without epsilon this comparison probably returns the wrong value. The class supports these functions: abs(A) : Absolute value of A. If A is negative, returns -A otherwise returns A. acos(A) : Arc-cosine of A. Returns the angle, measured in radians, whose cosine is A. acosh(A) : Same as acos() but for hyperbolic cosine. asin(A) : Arc-sine of A. Returns the angle, measured in radians, whose sine is A. asinh(A) : Same as asin() but for hyperbolic sine. atan(A) : Arc-tangent of (A). Returns the angle, measured in radians, whose tangent is (A). atan2(A,B): Arc-tangent of A/B. The two main differences to atan() is that it will return the right angle depending on the signs of A and B (atan() can only return values betwen -pi/2 and pi/2), and that the return value of pi/2 and -pi/2 are possible. atanh(A) : Same as atan() but for hyperbolic tangent. ceil(A) : Ceiling of A. Returns the smallest integer greater than A. Rounds up to the next higher integer. cos(A) : Cosine of A. Returns the cosine of the angle A, where A is measured in radians. cosh(A) : Same as cos() but for hyperbolic cosine. cot(A) : Cotangent of A (equivalent to 1/tan(A)). csc(A) : Cosecant of A (equivalent to 1/sin(A)). eval(...) : This a recursive call to the function to be evaluated. The number of parameters must be the same as the number of parameters taken by the function. Must be called inside if() to avoid infinite recursion. exp(A) : Exponential of A. Returns the value of e raised to the power A where e is the base of the natural logarithm, i.e. the non-repeating value approximately equal to 2.71828182846. floor(A) : Floor of A. Returns the largest integer less than A. Rounds down to the next lower integer. if(A,B,C) : If int(A) differs from 0, the return value of this function is B, else C. Only the parameter which needs to be evaluated is evaluated, the other parameter is skipped; this makes it safe to use eval() in them. int(A) : Rounds A to the closest integer. 0.5 is rounded to 1. log(A) : Natural (base e) logarithm of A. log10(A) : Base 10 logarithm of A. max(A,B) : If A>B, the result is A, else B. min(A,B) : If A1, n*eval(n-1), 1)" Note that a recursive call has some overhead, which makes it a bit slower than any other operation. It may be a good idea to avoid recursive functions in very time-critical applications. Recursion also takes some memory, so extremely deep recursions should be avoided (eg. millions of nested recursive calls). Also note that even though the maximum recursion level of eval() is limited, it is possible to write functions which never reach that level but still take enormous amounts of time to evaluate. This can sometimes be undesirable because it is prone to exploitation, but you can disable the eval() function completely in the fpconfig.hh file. ============================================================================= - Contacting the author ============================================================================= Any comments, bug reports, etc. should be sent to warp@iki.fi ============================================================================= - The algorithm used in the library ============================================================================= The whole idea behind the algorithm is to convert the regular infix format (the regular syntax for mathematical operations in most languages, like C and the input of the library) to postfix format. The postfix format is also called stack arithmetic since an expression in postfix format can be evaluated using a stack and operating with the top of the stack. For example: infix postfix 2+3 2 3 + 1+2+3 1 2 + 3 + 5*2+8/2 5 2 * 8 2 / + (5+9)*3 5 9 + 3 * The postfix notation should be read in this way: Let's take for example the expression: 5 2 * 8 2 / + - Put 5 on the stack - Put 2 on the stack - Multiply the two values on the top of the stack and put the result on the stack (removing the two old values) - Put 8 on the stack - Put 2 on the stack - Divide the two values on the top of the stack - Add the two values on the top of the stack (which are in this case the result of 5*2 and 8/2, that is, 10 and 4). At the end there's only one value in the stack, and that value is the result of the expression. Why stack arithmetic? The last example above can give you a hint. In infix format operators have precedence and we have to use parentheses to group operations with lower precedence to be calculated before operations with higher precedence. This causes a problem when evaluating an infix expression, specially when converting it to byte code. For example in this kind of expression: (x+1)/(y+2) we have to calculate first the two additions before we can calculate the division. We have to also keep counting parentheses, since there can be a countless amount of nested parentheses. This usually means that you have to do some type of recursion. The most simple and efficient way of calculating this is to convert it to postfix notation. The postfix notation has the advantage that you can make all operations in a straightforward way. You just evaluate the expression from left to right, applying each operation directly and that's it. There are no parentheses to worry about. You don't need recursion anywhere. You have to keep a stack, of course, but that's extremely easily done. Also you just operate with the top of the stack, which makes it very easy. You never have to go deeper than 2 items in the stack. And even better: Evaluating an expression in postfix format is never slower than in infix format. All the contrary, in many cases it's a lot faster (eg. because all parentheses are optimized away). The above example could be expressed in postfix format: x 1 + y 2 + / The good thing about the postfix notation is also the fact that it can be extremely easily expressed in bytecode form. You only need a byte value for each operation, for each variable and to push a constant to the stack. Then you can interpret this bytecode straightforwardly. You just interpret it byte by byte, from the beginning to the end. You never have to go back, make loops or anything. This is what makes byte-coded stack arithmetic so fast. ============================================================================= Usage license: ============================================================================= Copyright © 2003-2005 Juha Nieminen, Joel Yliluoma This library is distributed under two distinct usage licenses depending on the software ("Software" below) which uses the Function Parser library ("Library" below). The reason for having two distinct usage licenses is to make the library compatible with the GPL license while still being usable in other non-GPL (even commercial) software. A) If the Software using the Library is distributed under the GPL license, then the Library can be used under the GPL license as well. The Library will be under the GPL license only when used with the Software. If the Library is separated from the Software and used in another different software under a different license, then the Library will have the B) license below. Exception to the above: If the Library is modified for the GPL Software, then the Library cannot be used with the B) license without the express permission of the author of the modifications. A modified library will be under the GPL license by default. That is, only the original, unmodified version of the Library can be taken to another software with the B) license below. The author of the Software should provide an URL to the original version of the Library if the one used in the Software has been modified. (http://iki.fi/warp/FunctionParser/) This text file must be distributed in its original intact form along with the sources of the Library. (Documentation about possible modifications to the library should be put in a different text file.) B) If the Software using the Library is not distributed under the GPL license but under any other license, then the following usage license applies to the Library: 1. This library is free for non-commercial usage. You can do whatever you like with it as long as you don't claim you made it yourself. 2. It is possible to use this library in a commercial program, but in this case you MUST contact me first (warp@iki.fi) and ask express permission for this. (Read explanation at the end of the file.) If you are making a free program or a shareware program with just a nominal price (5 US dollars or less), you don't have to ask for permission. In any case, I DON'T WANT MONEY for the usage of this library. It is free, period. 3. You can make any modifications you want to it so that it conforms your needs. If you make modifications to it, you have, of course, credits for the modified parts. 4. If you use this library in your own program, you don't have to provide the source code if you don't want to (ie. the source code of your program or this library). If you DO include the source code for this library, this text file must be included in its original intact form. 5. If you distribute a program which uses this library, and specially if you provide the source code, proper credits MUST be included. Trying to obfuscate the fact that this library is not made by you or that it is free is expressly prohibited. When crediting the usage of this library, it's enough to include my name and email address, that is: "Juha Nieminen (warp@iki.fi)". Also a URL to the library download page would be nice, although not required. The official URL is: http://iki.fi/warp/FunctionParser/ 6. And the necessary "lawyer stuff": The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. The software is provided "as is", without warranty of any kind, express or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose and noninfringement. In no event shall the authors or copyright holders be liable for any claim, damages or other liability, whether in an action of contract, tort or otherwise, arising from, out of or in connection with the software or the use or other dealings in the software. --- Explanation of the section 2 of the B) license above: The section 2 tries to define "fair use" of the library in commercial programs. "Fair use" of the library means that the program is not heavily dependent on the library, but the library only provides a minor secondary feature to the program. "Heavily dependent" means that the program depends so much on the library that without it the functionality of the program would be seriously degraded or the program would even become completely non-functional. In other words: If the program does not depend heavily on the library, that is, the library only provides a minor secondary feature which could be removed without the program being degraded in any considerable way, then it's OK to use the library in the commercial program. If, however, the program depends so heavily on the library that removing it would make the program non-functional or degrade its functionality considerably, then it's NOT OK to use the library. The ideology behind this is that it's not fair to use a free library as a base for a commercial program, but it's fair if the library is just a minor, unimportant extra. If you are going to ask me for permission to use the library in a commercial program, please describe the feature which the library will be providing and how important it is to the program.